FGK Model: An Efficient Granular Computing Model for Protein Sequence Motifs Information Discovery

Authors

  • Bernard Chen
  • Phang C. Tai
  • Robert Harrison
  • Yi Pan
Abstract

Discovering protein sequence motif information is one of the most crucial tasks in bioinformatics research. In this paper, we seek recurring protein patterns that are universally conserved across protein family boundaries. Achieving this goal requires an extremely large dataset, and therefore an efficient technique. We explore short recurring protein segments using a granular computing strategy. First, the Fuzzy C-Means (FCM) clustering algorithm separates the whole dataset into several smaller information granules; then a K-means clustering algorithm with a novel greedy initialization is applied to each granule to obtain the final results. We also propose a new evaluation method for sequence motif information, based on the function of the HSSP and the BLOSUM62 matrix. Compared with the existing IEEE Trans. research results, our method requires only one fifth of the execution time and shows better results on all three quality measures.
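The two-stage strategy described in the abstract can be illustrated with a minimal sketch. It assumes NumPy and numeric feature vectors for the sequence segments; the abstract does not specify the greedy initialization, the granule membership rule, or any parameter values, so the farthest-point seeding, the `threshold` parameter, and all function names below are hypothetical stand-ins rather than the authors' method.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard Fuzzy C-Means; returns (centers, U) with U of shape (n, c)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=c, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                      # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)   # membership update
        W = U ** m
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return centers, U

def farthest_point_seeds(X, k, seed=0):
    """Greedy seeding (an assumed stand-in, not the paper's procedure):
    each new seed is the point farthest from the seeds chosen so far."""
    rng = np.random.default_rng(seed)
    first = rng.integers(len(X))
    seeds = [X[first]]
    d = np.linalg.norm(X - X[first], axis=1)
    for _ in range(k - 1):
        idx = d.argmax()
        seeds.append(X[idx])
        d = np.minimum(d, np.linalg.norm(X - X[idx], axis=1))
    return np.array(seeds)

def k_means(X, centers, n_iter=100):
    """Lloyd's K-means starting from the given centers."""
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new, centers):
            break
        centers = new
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return centers, d.argmin(axis=1)

def fgk_sketch(X, n_granules=2, k_per_granule=2, threshold=0.3):
    """Stage 1: FCM splits X into overlapping granules.
    Stage 2: greedily seeded K-means runs inside each granule."""
    _, U = fuzzy_c_means(X, n_granules)
    member = U >= threshold                        # assumed membership rule
    member[np.arange(len(X)), U.argmax(axis=1)] = True  # no point left out
    results = []
    for g in range(n_granules):
        idx = np.flatnonzero(member[:, g])
        if len(idx) < k_per_granule:
            continue                               # granule too small to cluster
        seeds = farthest_point_seeds(X[idx], k_per_granule)
        centers, labels = k_means(X[idx], seeds)
        results.append((idx, centers, labels))
    return results
```

Calling `fgk_sketch(X)` on an `(n, d)` array returns, for each granule, the indices of its members, the K-means centers, and the cluster label of each member. Because each K-means run sees only one granule, the second stage works on smaller subsets of the data, which is the source of the efficiency gain the abstract describes.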


Similar resources

Novel efficient granular computing models for protein sequence motifs and structure information discovery

Protein sequence motifs have the potential to determine the conformation, function and activities of the proteins. In order to obtain protein sequence motifs which are universally conserved across protein family boundaries, unlike most popular motif discovering algorithms, our input dataset is extremely large. As a result, an efficient technique is demanded. We create two granular computing mod...


Efficient Super Granular SVM Feature Elimination (Super GSVM-FE) model for protein sequence motif information extraction

Protein sequence motifs are attracting increasing attention in the sequence analysis area. The conserved regions have the potential to determine the conformation, function and activities of the proteins. We develop a new method that combines the concept of granular computing and the power of Ranking-SVM to further extract protein sequence motif information generated from the FGK model. The quality...


Protein Sequence Motif Detection using Novel Rough Granular Computing Model

Protein sequence motif information is essential for the analysis of biologically significant regions. Discovering sequence motifs is a key task in understanding the connection between sequences and their structures. Protein sequence motifs have the potential to determine the function and activities of the proteins. Many algorithms or techniques are used to determine motifs which require a predefined fix...


Discovery and Extraction of Protein Sequence Motif Information that Transcends Protein Family Boundaries

Protein sequence motifs are gathering more and more attention in the field of sequence analysis. The recurring patterns have the potential to determine the conformation, function and activities of the proteins. In our work, we obtained protein sequence motifs which are universally conserved across protein family boundaries. Therefore, unlike most popular motif discovering algorithms, our input ...


Exploring Highly Structure Similar Protein Sequence Motifs using SVD with Soft Granular Computing Models

Protein sequence analysis is one of the vital areas of bioinformatics research. Protein sequence motifs determine the structure, function, and activities of a particular protein. The main objective of this paper is to obtain protein sequence motifs which are universally conserved across protein family boundaries. In this research, the input dataset is extremely large. Hence, an efficien...



Publication date: 2006